Abstract

Particle filtering is a powerful approximation method that applies to state estimation in nonlinear and
non-Gaussian dynamical state-space models. Unfortunately, the approximation error depends exponentially
on the system dimension. This means that an incredibly large number of particles may be needed to
appropriately control the error in very large scale filtering problems. The computational burden required is often
prohibitive in practice. Rebeschini and Van Handel (2013) analyse a new approach for particle filtering in
large-scale dynamic random fields. Through a suitable localisation operation they reduce the dependence
of the error to the size of local sets, each of which may be considerably smaller than the dimension of the
original system. The drawback is that this localisation operation introduces a bias. In this work, we propose
a modified version of Rebeschini and Van Handel's blocked particle filter. We introduce a new degree of
freedom allowing us to reduce the bias. We do this by enlarging the space during the update phase and thus
reducing the amount of dependent information thrown away due to localisation. By designing an appropriate
tradeoff between the various tuning parameters it is possible to reduce the total error bound via allowing
a temporary enlargement of the update operator without really increasing the overall computational burden.


1  Introduction 

Recursive Bayesian estimation (or filtering) is a technique for recursively estimating the state of a random
process observed via noisy measurements. If the underlying dynamical model is linear and Gaussian we have
the celebrated Kalman filter [11], which is an exact solution to the Bayesian filtering problem. Unfortunately,
in many practical scenarios of interest, the Bayes filter is not exactly computable. Therefore, we seek
techniques to approximate this ideal filter. The Kalman filter can be applied in more general settings [9][11] as an
approximation. Particle filtering is a more general approximation method that is easily applied to nonlinear
and non-Gaussian state-space models. The particle filter approximates the Bayesian filter via Monte Carlo
simulation/sampling. The samples (or particles) are propagated through a sequential importance sampling
mechanism that attempts to capture the dynamics of the unobservable process and the likelihood of the
observations available. Other approximations exist, such as Gaussian mixture filters [9][11].

The particle filter has been widely studied in theory and in countless practical applications [8][13][7]. In [6]
the authors prove that the error can be controlled uniformly in time, thus providing solid mathematical
support for application of the filter in numerous fields. Unfortunately, the particle filter computation is strongly
dependent on the dimension of the underlying estimation problem. Specifically, the error bound grows
exponentially with the system's dimension, making the filter infeasible in most high-dimensional applications.
This problem is known as the curse of dimensionality [14]. A heuristic explanation of this phenomenon for a
particular case can be found in [17]. In [5][14] the authors give a precise relation between the dimension of the
system and the number of particles required to avoid weight degeneracy [7]. The fact that the approximation
error is exponential in the dimension and only inversely controlled by the sample size implies that an
incredibly large number of particles is required when dealing with a high-dimensional system if we want to control
the error at a reasonable level. A large number of particles means a heavy computational burden
that is often simply prohibitive.

Recent studies [3][4][15][16], however, suggest that high-dimensional particle filtering may be feasible in
particular applications and/or if one is willing to accept a degree of systematic bias. In [3], the particle filter
is applied in a static setting where the objective is to sample from some high-dimensional target distribution.
In this case, through a sequence of intermediate and simpler distributions, it is shown that the particle filter
will converge to a sampled representation of the target distribution with a typical Monte Carlo error (inverse
in the number of particles) given a complexity on the order of the dimension squared. Although [3] deals only,
in essence, with a static problem of sampling from a fixed target distribution, the analysis introduces a novel
way of thinking about high-dimensional particle filtering which may carry over to dynamic filtering problems.
Related  work  appears  in  (3 . 

1.1 Background: The Motivating Paper

In [15] the authors consider particle filtering in large-scale dynamic random fields. They assume the dynamics
of the underlying process are localised to a neighbourhood of the field and the observations are local to each
site. They exploit this idea by localising the algorithm during the update phase. They argue that the difficulty in
high-dimensional particle filtering is due largely to the dimension of the observation and the nonlinearity of the
update operation. Therefore, they partition the field into independent blocks and correct every marginalised
block separately. The posterior is simply the product of the blocked marginals. The real contribution of [15]
is a descriptive and technical analysis that shows the error introduced due to the localisation procedure can
be readily controlled if the dynamics of the random field at each site are only locally dependent on those sites
within close proximity. The standard sampling approximation error is shown to be exponential in only the
size of the individual blocks. The number of samples/particles controls the sampling approximation error at
the typical rate while the error due to the localisation process is a systematic bias that can only be controlled
through an increase in the block size. Since each block is updated independently, parallel implementation is
readily applicable and the computational burden may be alleviated, although this remains to be seen in practice.
While the results of [15] are at the proof-of-concept stage, the idea is incredibly powerful.

The authors in [15] show that although the total approximation error can be controlled uniformly in time,
it suffers from a spatial inhomogeneity. Specifically, the nodes close to the block boundaries display a larger
error than those far removed from the boundaries (as one might expect). A simple approach to average out this
spatial inhomogeneity is given in [2], where adaptive partitioning of the field is employed.

1.2  Contribution 

In  this  paper  we  consider  again  the  idea  proposed  in  [15]  and  propose  a  modified  particle  filtering  algorithm 
that  displays  an  additional  degree  of  freedom.  The  idea  proposed  herein  is  to  enlarge  the  blocks  during  the 
update  phase,  allowing  for  more  observations  to  be  employed  during  the  correction  at  each  block.  The  main 
contribution  is  the  addition  of  a  new  parameter  that  captures  how  much  we  enlarge  each  block  prior  to  the 
update.  Obviously,  by  enlarging  each  block  prior  to  updating  we  reduce  the  bias  error  but  we  increase  the 
complexity  involved  in  updating  each  (enlarged)  block.  By  designing  an  appropriate  tradeoff  between  the 
various  tuning  parameters  it  is  possible  to  reduce  the  total  error  bound  via  allowing  a  temporary  enlargement 
of  the  update  operator  without  increasing  the  overall  computational  burden. 

2  Problem  Setup  and  Applications  of  the  Blocked  Filter 

We borrow the problem setup and notation directly from [15].

Consider a Markov chain $(X_n)_{n\ge 0}$ defined on a Polish state space $\mathbb{X}$ with transition density
$p : \mathbb{X}\times\mathbb{X}\to\mathbb{R}_+$ with respect to a reference measure $\psi$. Moreover, consider a process
$(Y_n)_{n\ge 0}$, defined on a Polish space $\mathbb{Y}$, conditionally independent given $(X_n)_{n\ge 0}$, with a
transition density $g : \mathbb{X}\times\mathbb{Y}\to\mathbb{R}_+$ with respect to a measure $\varphi$.
The process $(X_n)_{n\ge 0}$ is observed via the process $(Y_n)_{n\ge 0}$. Our aim is to estimate the conditional
distribution of the state $X_n$ given the measurements up to that time and the initial condition $\mu$. Therefore
we introduce the filter

$$\pi_n^\mu := \mathbf{P}^\mu[X_n\in\cdot \mid Y_1,\ldots,Y_n].$$


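It can be easily seen, using Bayes' rule, that the filter can be written in a recursive way

$$\pi_0^\mu = \mu, \qquad \pi_n^\mu = \mathsf{F}_n\pi_{n-1}^\mu \quad (n\ge 1),$$

where the operator $\mathsf{F}_n$ is defined as follows

$$(\mathsf{F}_n\rho)(A) := \frac{\int 1_A(x)\,p(x_0,x)\,g(x,Y_n)\,\psi(dx)\,\rho(dx_0)}{\int p(x_0,x)\,g(x,Y_n)\,\psi(dx)\,\rho(dx_0)}.$$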

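Moreover, the above operator is typically split into two sub-steps $\mathsf{F}_n = \mathsf{C}_n\mathsf{P}$ where

$$(\mathsf{P}\rho)(A) := \int 1_A(x)\,p(x_0,x)\,\psi(dx)\,\rho(dx_0)$$

is a prediction step, and

$$(\mathsf{C}_n\rho)(A) := \frac{\int 1_A(x)\,g(x,Y_n)\,\rho(dx)}{\int g(x,Y_n)\,\rho(dx)}$$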


is a correction (or update) step. In the prediction step, the measure is transformed according to the density
$p(\cdot,\cdot)$, while in the update step we use the new information $Y_n$ to correct the predicted measure. We then write
the recursion as follows

$$\pi_{n-1}^\mu \xrightarrow{\ \text{prediction}\ } \pi_{n-}^\mu = \mathsf{P}\pi_{n-1}^\mu \xrightarrow{\ \text{correction}\ } \pi_n^\mu = \mathsf{C}_n\pi_{n-}^\mu.$$

The classic bootstrap particle filter uses $N$ particles (or samples) to approximate the measure $\pi_n^\mu$. Given a
sampled approximation of $\pi_{n-1}^\mu$, the particles are first moved according to the transition $p(\cdot,\cdot)$ in order to
obtain a sampled representation of the prediction. The update then computes a weighted posterior
empirical measure via $g(\cdot,\cdot)$. Finally, a resampling step is added in order to avoid weight degeneracy [7].
More formally, denoting the bootstrap filter by $\hat\pi_n^\mu$, we have $\hat\pi_n^\mu = \hat{\mathsf{F}}_n\hat\pi_{n-1}^\mu$ where
$\hat{\mathsf{F}}_n = \mathsf{C}_n\mathsf{S}^N\mathsf{P}$ and $\mathsf{S}^N$ represents the sampling operator, defined by

$$\mathsf{S}^N\rho := \frac{1}{N}\sum_{i=1}^{N}\delta_{x(i)}, \qquad x(1),\ldots,x(N) \ \text{i.i.d.} \sim \rho.$$

It is possible to prove that

$$\sup_{|f|\le 1}\mathbf{E}\left|\pi_n^\mu(f) - \hat\pi_n^\mu(f)\right| \le \frac{\alpha_0}{\sqrt{N}}$$

with $\alpha_0$ independent of time. Unfortunately, the constant $\alpha_0$ typically depends (exponentially) on the
dimension of the underlying problem. Intuition for this exponential dependence is given in [15][17].
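
As a minimal sketch of the prediction, correction and resampling steps just described, the following Python fragment implements one step of a bootstrap filter. The scalar toy model, the function names and the parameter values are assumptions chosen for illustration only; they are not taken from [15] or from this paper.

```python
import numpy as np

def bootstrap_step(particles, y, sample_transition, likelihood, rng):
    """One bootstrap filter step: predict, weight by g(x, y), resample."""
    # Prediction/sampling: move every particle through the transition density p.
    predicted = sample_transition(particles, rng)
    # Correction: weight each particle by the observation likelihood g(x, Y_n).
    weights = likelihood(predicted, y)
    weights = weights / weights.sum()
    # Resampling: draw N indices with probability proportional to the weights.
    n = len(predicted)
    idx = rng.choice(n, size=n, p=weights)
    return predicted[idx]

# Hypothetical toy model (an assumption, not from the paper):
# X_n = 0.9 X_{n-1} + N(0, 1) noise, Y_n = X_n + N(0, 1) noise.
def toy_transition(x, rng):
    return 0.9 * x + rng.normal(size=x.shape)

def toy_likelihood(x, y):
    return np.exp(-0.5 * (y - x) ** 2)

rng = np.random.default_rng(0)
particles = rng.normal(size=1000)          # samples from the initial condition
particles = bootstrap_step(particles, 1.5, toy_transition, toy_likelihood, rng)
print(particles.mean())                    # Monte Carlo estimate of E[X_1 | Y_1 = 1.5]
```

In this scalar example the weights stay well balanced; the difficulty discussed above arises when the likelihood is a product over many sites, so that a few particles carry almost all of the weight.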

We now consider the pair $(X_n, Y_n)$ as a random field $(X_n^v, Y_n^v)_{v\in V}$ indexed on a finite undirected graph
$G = (V, E)$. The vertex set $V$ represents the collection of sites and the edge set $E$ the spatial relationships
between them. The cardinality of $V$ captures, in some sense, the dimension of interest. More formally, the
spaces $\mathbb{X}$ and $\mathbb{Y}$ are defined as products $\mathbb{X} := \prod_{v\in V}\mathbb{X}^v$, $\mathbb{Y} := \prod_{v\in V}\mathbb{Y}^v$.
The reference measures are products $\psi := \bigotimes_{v\in V}\psi^v$, $\varphi := \bigotimes_{v\in V}\varphi^v$, where $\psi^v$ and
$\varphi^v$ are reference measures on $\mathbb{X}^v$ and $\mathbb{Y}^v$ respectively. The transition
densities are defined as

$$p(x,z) := \prod_{v\in V} p^v(x,z^v), \qquad g(x,y) := \prod_{v\in V} g^v(x^v,y^v)$$

where $p^v : \mathbb{X}\times\mathbb{X}^v\to\mathbb{R}_+$ and $g^v : \mathbb{X}^v\times\mathbb{Y}^v\to\mathbb{R}_+$ are densities with respect to the
reference measures $\psi^v$ and $\varphi^v$.
From the definition we can see that the observations $Y_n$ are assumed to be completely local, in the sense that
$Y_n^v$ depends only on the value assumed by $X_n^v$. The process $(X_n)_{n\ge 0}$ is local in the sense that the state at a
site $v$ depends only on the state at nearby sites. We state this formally. Consider the graph $G$ equipped with the
distance $d(v,v')$ defined by the number of hops along the shortest path connecting $v$ and $v'$. We can define
the neighbourhood of a site $v$ as

$$N(v) := \{v'\in V : d(v,v')\le r\}$$

where $r$ represents the range of interaction. Then we assume

$$p^v(x_1,z^v) = p^v(x_2,z^v) \quad \text{whenever } x_1^{N(v)} = x_2^{N(v)}$$

where we write, for $I\subseteq V$, $x^I = (x^i)_{i\in I}$. In other words, the random field $(X_n)_{n\ge 0}$ is local in the sense that given
$X_0,\ldots,X_{n-1}$ the present state $X_n^v$ depends only on $X_{n-1}^{N(v)}$.
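
The hop distance and the neighbourhood $N(v)$ are simple to compute on a concrete graph. The sketch below uses breadth-first search on an adjacency-list graph; the dictionary representation and the path-graph example are assumptions made for illustration.

```python
from collections import deque

def hop_distances(adj, source):
    """Breadth-first search: hop distance d(source, v') to every reachable site."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def neighbourhood(adj, v, r):
    """N(v) = {v' in V : d(v, v') <= r}, with r the range of interaction."""
    return {u for u, d in hop_distances(adj, v).items() if d <= r}

# Hypothetical example: the path graph 0 - 1 - 2 - 3 - 4.
adj = {i: [j for j in (i - 1, i + 1) if 0 <= j <= 4] for i in range(5)}
print(sorted(neighbourhood(adj, 2, 1)))  # [1, 2, 3]
```

Under the locality assumption, the transition density at site 2 in this example may depend on the previous state only through sites 1, 2 and 3.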


2.1 Blocked Particle Filter

In [15] the authors propose an application of the blocked filter algorithm to the field model just explained,
exploiting the local dynamic dependencies. We briefly illustrate this algorithm. Consider a partition $\mathcal{K}$
of $V$ into non-overlapping blocks with a union equal to $V$. The idea is to create independence across blocks
on $V$ by marginalising after the prediction step. We then update each block separately and finally we form the
posterior via the product of the independent (updated) blocked marginals. More formally, consider the block operator
$\mathsf{B}$ on the space $\mathcal{M}(\mathbb{X})$ of measures on $\mathbb{X}$, defined by

$$\mathsf{B}\rho := \bigotimes_{K\in\mathcal{K}} \mathsf{B}^K\rho$$

where $\mathsf{B}^K\rho$ is the marginal of the measure $\rho$ on the subset $K\subseteq V$. Then the proposed block filter can be written
as a recursion $\hat\pi_0^\mu = \mu$, $\hat\pi_n^\mu = \hat{\mathsf{F}}_n\hat\pi_{n-1}^\mu$, where the operator $\hat{\mathsf{F}}_n = \mathsf{C}_n\mathsf{B}\mathsf{S}^N\mathsf{P}$ consists of four steps


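$$\hat\pi_{n-1}^\mu \xrightarrow{\ \text{prediction/sampling}\ } \hat\pi_{n-}^\mu = \mathsf{S}^N\mathsf{P}\hat\pi_{n-1}^\mu \xrightarrow{\ \text{blocking/correction}\ } \hat\pi_n^\mu = \mathsf{C}_n\mathsf{B}\hat\pi_{n-}^\mu.$$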
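
The blocking/correction step can be sketched on a particle array as follows. This is an illustrative reading of the operator $\mathsf{C}_n\mathsf{B}$ acting on an empirical measure, not code from [15]: each block is reweighted with its own local observations and resampled independently, so the output represents a product of updated block marginals. The Gaussian site likelihood and all names are assumptions.

```python
import numpy as np

def gaussian_site_likelihood(x, y):
    """Hypothetical local likelihood g^v(x^v, y^v): unit-variance Gaussian."""
    return np.exp(-0.5 * (y - x) ** 2)

def blocked_correction(predicted, y, blocks, site_likelihood, rng):
    """Blocking + correction on an (N, |V|) particle array.

    blocks is a list of lists of site indices forming a partition of V.
    """
    n = predicted.shape[0]
    updated = np.empty_like(predicted)
    for block in blocks:
        # Weights use only the observations at sites inside this block.
        w = np.prod(site_likelihood(predicted[:, block], y[block]), axis=1)
        w = w / w.sum()
        # Independent resampling per block yields a product of block marginals.
        idx = rng.choice(n, size=n, p=w)
        updated[:, block] = predicted[idx][:, block]
    return updated

rng = np.random.default_rng(1)
predicted = rng.normal(size=(500, 6))      # N = 500 particles, |V| = 6 sites
y = np.zeros(6)                            # one local observation per site
blocks = [[0, 1, 2], [3, 4, 5]]
updated = blocked_correction(predicted, y, blocks, gaussian_site_likelihood, rng)
```

Because each weight is a product over the sites of one block only, the weight degeneracy is governed by the block size rather than by $|V|$; the price is that dependence between blocks is discarded.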


We  make  the  following  definition. 

Definition 1. Given $\mu,\nu\in\mathcal{M}(\mathbb{X})$ and a subset $I\subseteq V$ we define a distance between the marginals on $I$ as follows

$$|||\mu-\nu|||_I := \sup_{f\in\mathcal{M}(\mathbb{X}^I):\,|f|\le 1} \mathbf{E}\left[|\mu(f)-\nu(f)|^2\right]^{1/2}$$

where the expectation is taken with respect to the random sampling and $\mathcal{M}(\mathbb{X}^I)$ is the class of measurable
functions on $\mathbb{X}$ that depend only on the values on $I$, that is $f(x) = f(y)$ when $x^I = y^I$. If $I = V$ we omit the subscript
and write $|||\mu-\nu|||$.

With no expectation it follows that $|||\cdot|||$ is equivalent to the total variation, which we write as $\|\cdot\|$. The two
norms are interchangeable when no sampling occurs.

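Now, given a set $I\subseteq V$ we define the boundary and the interior

$$\partial I := \{v\in I : N(v)\not\subseteq I\}, \qquad \mathrm{int}(I) := I\setminus\partial I$$

and given a partition $\mathcal{K}$, we define the following quantities

$$\Delta := \max_{v\in V}|N(v)|, \qquad |\mathcal{K}|_\infty := \max_{K\in\mathcal{K}}|K|, \qquad \Delta_{\mathcal{K}} := \max_{K\in\mathcal{K}}\left|\{K'\in\mathcal{K} : d(K,K')\le r\}\right|$$

where the first quantity is independent of the partition. The result proven in [15] is the following.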

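Theorem 1 (Blocked Particle Filter [15]). There exists a constant $0<\varepsilon_0<1$, depending only on the quantities
$\Delta$, $\Delta_{\mathcal{K}}$, such that if there exist $\varepsilon_0<\varepsilon<1$ and $0<\kappa<1$ such that

$$\varepsilon\le p^v(x,z^v)\le\varepsilon^{-1}, \qquad \kappa\le g^v(x^v,y^v)\le\kappa^{-1} \qquad \forall\, x,z\in\mathbb{X},\ y\in\mathbb{Y},\ v\in V$$

then for every $x\in\mathbb{X}$, $n\ge 0$, $K\in\mathcal{K}$ and $I\subseteq K$ we have

$$|||\pi_n^x-\hat\pi_n^x|||_I \le \alpha\,|I|\left[e^{-\beta_1 d(I,\partial K)} + \frac{e^{\beta_2|\mathcal{K}|_\infty}}{\sqrt{N}}\right]$$

where the constants $\alpha$, $\beta_1$, $\beta_2$ are positive, finite and dependent only on $\Delta$, $\Delta_{\mathcal{K}}$, $\varepsilon$, $\kappa$ and $r$.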


The intuition is that the algorithm approximation error is exponential in $|\mathcal{K}|_\infty$ rather than in $|V|$ but that the
error at some individual locations increases with the proximity of those locations to the border of the blocks.
This leads to a spatial inhomogeneity as seen in the first term of the bound.
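
The graph quantities that enter Theorem 1 can be evaluated directly for a given partition. The following sketch computes the boundary of a block together with $\Delta$, $|\mathcal{K}|_\infty$ and $\Delta_{\mathcal{K}}$; the adjacency-list representation, the function names and the path-graph example are illustrative assumptions.

```python
from collections import deque

def ball(adj, v, r):
    """N(v): all sites within r hops of v, found by breadth-first search."""
    dist = {v: 0}
    queue = deque([v])
    while queue:
        u = queue.popleft()
        if dist[u] == r:
            continue  # do not expand beyond the range of interaction
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return set(dist)

def boundary(adj, block, r):
    """Boundary of a block: sites whose neighbourhood N(v) leaves the block."""
    block = set(block)
    return {v for v in block if not ball(adj, v, r) <= block}

def partition_constants(adj, blocks, r):
    """Return (Delta, |K|_inf, Delta_K) for a partition given as a list of blocks."""
    delta = max(len(ball(adj, v, r)) for v in adj)
    k_inf = max(len(K) for K in blocks)

    def reach(K):
        # All sites within r hops of some site of K.
        return set().union(*(ball(adj, v, r) for v in K))

    # d(K, K') <= r exactly when K' contains a site within r hops of K.
    delta_k = max(sum(1 for K2 in blocks if reach(K) & set(K2)) for K in blocks)
    return delta, k_inf, delta_k

# Hypothetical example: path graph on 6 sites, two blocks of size 3, r = 1.
adj = {i: [j for j in (i - 1, i + 1) if 0 <= j <= 5] for i in range(6)}
blocks = [[0, 1, 2], [3, 4, 5]]
print(boundary(adj, blocks[0], 1))            # {2}
print(partition_constants(adj, blocks, 1))    # (3, 3, 2)
```

In this example site 2 is the only boundary site of the first block, so the first term of the bound is largest there and decays for sites 1 and 0.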


2.2  Adaptively  Blocked  Particle  Filter 


A first attempt to achieve a spatially homogeneous error bound can be found in [2]. The idea is to consider a
finite number $m$ of partitions $\mathcal{K}_0,\ldots,\mathcal{K}_{m-1}$ and to apply them cyclically. Clearly we have to choose the partitions in such
a way that there is no node that is consistently close to a border. This condition is expressed by a bound on the
average, or exponential average, of the border distance. Given $\beta>0$ write

$$0<\theta_m(v) = \frac{1}{m}\sum_{j=0}^{m-1} d(v,\partial K_j(v)), \qquad 0<\varphi_m(v) = \frac{1}{m}\sum_{j=0}^{m-1} e^{-\beta d(v,\partial K_j(v))}$$

where $K_j(v)\in\mathcal{K}_j$ denotes the block of the $j$-th partition that contains $v$. Clearly $\theta$ and $\varphi$ represent how well
balanced the collection of partitions is. Define $\Delta_d(v) := \max_s d(v,\partial K_s(v))$
and $\nabla_d(v) := \min_s d(v,\partial K_s(v))$.

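Theorem 2 ([2]). There exists a constant $0<\varepsilon_0<1$, depending only on the quantities $\Delta$, $\Delta_{\mathcal{K}}$, such that if there
exist $\varepsilon_0<\varepsilon<1$ and $0<\kappa<1$ such that

$$\varepsilon\le p^v(x,z^v)\le\varepsilon^{-1}, \qquad \kappa\le g^v(x^v,y^v)\le\kappa^{-1} \qquad \forall\, x,z\in\mathbb{X},\ y\in\mathbb{Y},\ v\in V$$

then for every $x\in\mathbb{X}$, $n\ge 0$ and $v\in V$ we have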


1  m-l  I  TS\  pPIJSloo, 

-EK-rO'  -  +  — ) 


<  a  e 


VN 


where $0<\alpha,\beta<\infty$ depend only on $\varepsilon$, $\kappa$, $r$, $\Delta$ and $|\mathcal{K}|_\infty := \max_s\max_{K\in\mathcal{K}_s}|K|$ in this case.


If $\theta = \theta_m(v) = d(v,\partial K_j(v))$, where $K_j(v)\in\mathcal{K}_j$, for all $v\in V$, then the bound is completely spatially
invariant. See [2] for further discussion on this method.


3  Enlarged  Blocked  Particle  Filtering 

Suppose now we are given a partition $\mathcal{K}$ over $V$ but it turns out we are interested only in estimating the
marginal of $\pi_n^\mu$ on a particular block $K\in\mathcal{K}$. We could first redefine the partition with a larger block
encompassing $K$ and a collection of single-site blocks (to speed up the overall computation). It is of course not possible
to define a partition in this manner for multiple blocks of interest. However, the idea proposed here is based
on extending the state space by creating multiple independent copies of the measurements (and states) that
are then used in different (and independent) enlarged blocks.

We introduce some new notation. Consider a parameter $b>0$, which we will consider fixed throughout the
rest of the paper. Then define, for any $K\in\mathcal{K}$, an enlarged block

$$\bar{K} := \{v\in V : d(v,K)\le b\}$$
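
A minimal sketch of the enlargement, assuming an adjacency-list graph and the hop distance to a set taken as the minimum over its sites: a breadth-first search started from all sites of $K$ at once and truncated at depth $b$ returns $\bar{K}$. The function name and the example graph are illustrative assumptions.

```python
from collections import deque

def enlarge(adj, block, b):
    """Enlarged block: all sites v with hop distance d(v, K) <= b."""
    dist = {v: 0 for v in block}      # multi-source BFS from the whole block
    queue = deque(block)
    while queue:
        u = queue.popleft()
        if dist[u] == b:
            continue                  # stop expanding at depth b
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return set(dist)

# Hypothetical example: path graph on 8 sites, blocks of size 2, b = 1.
adj = {i: [j for j in (i - 1, i + 1) if 0 <= j <= 7] for i in range(8)}
blocks = [[0, 1], [2, 3], [4, 5], [6, 7]]
enlarged = [enlarge(adj, K, 1) for K in blocks]
print([sorted(K) for K in enlarged])
# [[0, 1, 2], [1, 2, 3, 4], [3, 4, 5, 6], [5, 6, 7]]
```

The enlarged blocks overlap (site 2 belongs to the first two of them), which is why they no longer form a partition of $V$.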


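Now define the enlarged spaces

$$\mathbb{X}^E := \prod_{K\in\mathcal{K}}\prod_{v\in\bar{K}}\mathbb{X}^v, \qquad \mathbb{Y}^E := \prod_{K\in\mathcal{K}}\prod_{v\in\bar{K}}\mathbb{Y}^v$$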


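Consider the collection $\bar{\mathcal{K}} = \{\bar{K} : K\in\mathcal{K}\}$. This is no longer a partition of $V$. However, $\bar{\mathcal{K}}$ is a partition on $\mathbb{X}^E$,
and here we can apply the blocking and updating operators associated with $\bar{\mathcal{K}}$. We use the superscript $E$ to
denote enlarged objects. The measures $\psi^E$ and $\varphi^E$ are defined straightforwardly. The block operator becomes

$$\mathsf{B}^E : \mathcal{M}(\mathbb{X})\to\mathcal{M}(\mathbb{X}^E), \qquad \mathsf{B}^E(\rho) := \bigotimes_{K\in\mathcal{K}}\mathsf{B}^{\bar{K}}\rho$$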


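To update, we need the same operator $\mathsf{C}_n$ redefined on the new space $\mathcal{M}(\mathbb{X}^E)$,

$$(\mathsf{C}_n^E\rho)(A) := \frac{\int 1_A(x)\prod_{K\in\mathcal{K}}\prod_{v\in\bar{K}} g^v(x^v,Y_n^v)\,\rho(dx)}{\int\prod_{K\in\mathcal{K}}\prod_{v\in\bar{K}} g^v(x^v,Y_n^v)\,\rho(dx)}$$

where, inside the product, $x^v$ denotes the coordinate of $x\in\mathbb{X}^E$ belonging to the copy of site $v$ in the enlarged block $\bar{K}$.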


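We also define

$$\mathsf{B}^{-1} : \mathcal{M}(\mathbb{X}^E)\to\mathcal{M}(\mathbb{X}), \qquad \mathsf{B}^{-1}(\rho) := \bigotimes_{K\in\mathcal{K}}\mathsf{B}^{K}\rho$$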

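Now we can write the enlarged blocked filter algorithm as a recursion

$$\hat\pi_0^\mu = \mu, \qquad \hat\pi_n^\mu = \hat{\mathsf{F}}_n^E\hat\pi_{n-1}^\mu \quad (n\ge 1)$$

where $\hat{\mathsf{F}}_n^E := \mathsf{B}^{-1}\mathsf{C}_n^E\mathsf{B}^E\mathsf{S}^N\mathsf{P}$. Now we have five steps. Skipping the prediction/sampling steps, graphically we have

$$\hat\pi_{n-}^\mu = \mathsf{S}^N\mathsf{P}\hat\pi_{n-1}^\mu \xrightarrow{\ \text{enlarging/blocking}\ } \mathsf{B}^E\hat\pi_{n-}^\mu \xrightarrow{\ \text{updating}\ } \mathsf{C}_n^E\mathsf{B}^E\hat\pi_{n-}^\mu \xrightarrow{\ \text{marginalizing}\ } \hat\pi_n^\mu = \mathsf{B}^{-1}\mathsf{C}_n^E\mathsf{B}^E\hat\pi_{n-}^\mu.$$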


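To write out the explicit expression of the filter we note that

$$(\mathsf{B}^{-1}\mathsf{C}_s^E\mathsf{B}^E\mathsf{P}\nu)(A) = (\mathsf{C}_s^E\mathsf{B}^E\mathsf{P}\nu)(A^E)$$

where $A^E := A\times(\mathbb{X}^E\setminus\mathbb{X})$. Therefore, splitting a variable $z\in\mathbb{X}^E$ as $z = (x,z_E)$ with $x\in\mathbb{X}$ and $z_E\in\mathbb{X}^E\setminus\mathbb{X}$ (where
we now put $E$ as a subscript just for notational simplicity) and an enlarged block as $\bar{K} = (K,K_E)$ where $K_E := \{v\in\bar{K} : v\notin K\}$,
we can write, with $\tilde{\mathsf{F}}_s := \mathsf{B}^{-1}\mathsf{C}_s^E\mathsf{B}^E\mathsf{P}$,

$$(\tilde{\mathsf{F}}_s\nu)(A) = \frac{\int 1_{A^E}(z)\prod_{K'\in\mathcal{K}}\left[\int\prod_{w\in\bar{K}'} p^w(x_0,z^w)\,g^w(z^w,Y_s^w)\,\nu(dx_0)\right]\psi^{\bar{K}'}(dz^{\bar{K}'})}{\int\prod_{K'\in\mathcal{K}}\left[\int\prod_{w\in\bar{K}'} p^w(x_0,z^w)\,g^w(z^w,Y_s^w)\,\nu(dx_0)\right]\psi^{\bar{K}'}(dz^{\bar{K}'})}$$

$$= \frac{\int 1_A(x)\prod_{K'\in\mathcal{K}}\left[\int\prod_{w\in K'} p^w(x_0,x^w)\,g^w(x^w,Y_s^w)\prod_{w\in K'_E} p^w(x_0,z_E^w)\,g^w(z_E^w,Y_s^w)\,\nu(dx_0)\right]\psi^{K'}(dx^{K'})\,\psi^{K'_E}(dz_E^{K'_E})}{\int\prod_{K'\in\mathcal{K}}\left[\int\prod_{w\in K'} p^w(x_0,x^w)\,g^w(x^w,Y_s^w)\prod_{w\in K'_E} p^w(x_0,z_E^w)\,g^w(z_E^w,Y_s^w)\,\nu(dx_0)\right]\psi^{K'}(dx^{K'})\,\psi^{K'_E}(dz_E^{K'_E})}$$

where for $I\subseteq V$ we write $\psi^I(dx^I) := \prod_{v\in I}\psi^v(dx^v)$.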
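
On a particle array, the enlarging/blocking, updating and marginalizing steps can be sketched as below. This is an illustrative interpretation rather than the authors' implementation: for each block $K$ the weights use the observations on the whole enlarged block $\bar{K}$, but only the coordinates in $K$ are retained after resampling. The Gaussian site likelihood, the example blocks and all names are assumptions.

```python
import numpy as np

def gaussian_site_likelihood(x, y):
    """Hypothetical local likelihood g^v(x^v, y^v): unit-variance Gaussian."""
    return np.exp(-0.5 * (y - x) ** 2)

def enlarged_blocked_correction(predicted, y, blocks, enlarged, site_likelihood, rng):
    """Enlarge/block, update and marginalise on an (N, |V|) particle array.

    blocks[i] is a block K (list of site indices); enlarged[i] is its
    enlargement, a list of site indices that contains blocks[i].
    """
    n = predicted.shape[0]
    updated = np.empty_like(predicted)
    for K, K_bar in zip(blocks, enlarged):
        # Updating: weights use every observation in the enlarged block.
        w = np.prod(site_likelihood(predicted[:, K_bar], y[K_bar]), axis=1)
        w = w / w.sum()
        idx = rng.choice(n, size=n, p=w)
        # Marginalising: keep only the coordinates of the original block K.
        updated[:, K] = predicted[idx][:, K]
    return updated

rng = np.random.default_rng(2)
predicted = rng.normal(size=(500, 6))
y = np.zeros(6)
blocks = [[0, 1], [2, 3], [4, 5]]
enlarged = [[0, 1, 2], [1, 2, 3, 4], [3, 4, 5]]   # enlargement by b = 1 on a path graph
updated = enlarged_blocked_correction(
    predicted, y, blocks, enlarged, gaussian_site_likelihood, rng)
```

Each observation enters the weights of several enlarged blocks, yet every site of $V$ is written exactly once, from the block that owns it. Passing `enlarged = blocks` recovers the standard blocked correction.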


4  Main  Results  and  Discussion 


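Define an ideal enlarged blocked filter $\tilde\pi_n^\mu := \tilde{\mathsf{F}}_n\cdots\tilde{\mathsf{F}}_1\mu$ where $\tilde{\mathsf{F}}_s := \mathsf{B}^{-1}\mathsf{C}_s^E\mathsf{B}^E\mathsf{P}$. Fix $I\subseteq V$. We then use the
triangle inequality to decompose the error according to

$$|||\pi_n^\mu-\hat\pi_n^\mu|||_I \le |||\pi_n^\mu-\tilde\pi_n^\mu|||_I + |||\tilde\pi_n^\mu-\hat\pi_n^\mu|||_I$$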


where we refer to the first and second decomposed terms as the bias and variance respectively. The bias
represents the error introduced solely as a result of the blocking operation. In the standard bootstrap filter, this
bias term vanishes and the typical analysis considers only the variance term.

Going  forward,  we  consider  bounding  both  the  bias  and  the  variance.  We  stress  however,  that  the  bias  is 
fundamentally  more  interesting  as  it  pertains  directly  to  the  localisation  idea  considered  herein.  Indeed,  the 
sampling  operation  that  leads  to  the  variance  term  could  be  replaced  with  other  approximation  techniques 
with  no  loss  of  generality  (albeit  a  different  approximation  error  than  detailed  subsequently). 

For the sake of completeness and clarity, we first state a result that includes both a bias and a variance bound.

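Theorem 3 (Main result). There exists a constant $0<\varepsilon_0<1$, depending only on $\Delta$ and $\Delta_{\bar{\mathcal{K}}}$, such that if there
exist $\varepsilon_0<\varepsilon<1$ and $0<\kappa<1$ with

$$\varepsilon\le p^v(x,z^v)\le\varepsilon^{-1}, \qquad \kappa\le g^v(x^v,y^v)\le\kappa^{-1} \qquad \forall\, x,z\in\mathbb{X},\ y\in\mathbb{Y},\ v\in V$$

then for every time $n\ge 0$, $x\in\mathbb{X}$, $K\in\mathcal{K}$ and $I\subseteq K$ we have

$$|||\pi_n^x-\hat\pi_n^x|||_I \le \alpha\,|I|\left[e^{-\beta_1 d(I,\partial\bar{K})} + \frac{e^{\beta_2|\bar{\mathcal{K}}|_\infty}}{\sqrt{N}}\right]$$

where the constants $0<\alpha,\beta_1,\beta_2<\infty$ depend only on $\varepsilon$, $\kappa$, $r$, $\Delta$, $\Delta_{\mathcal{K}}$, $\Delta_{\bar{\mathcal{K}}}$.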

This single (total error) bound is derived in practice as two separate bounds which we now explicitly state.

Theorem 4 (Bounding the Bias). Assume there exists $0<\varepsilon<1$ such that

$$\varepsilon\le p^v(x,z^v)\le\varepsilon^{-1} \quad \text{for all } v\in V,\ x,z\in\mathbb{X}$$

and such that

$$\varepsilon>\varepsilon_0 = \left(1-\frac{1}{18\Delta^2}\right)^{1/2\Delta}.$$

Let $\beta = -(2r)^{-1}\log 18\Delta^2(1-\varepsilon^{2\Delta})>0$. Then for every $n\ge 0$ we have

111^-7^111/  <  — g(l - £2A)\I\e~^d^I,a^ 

1  -  e~P 

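for every $x\in\mathbb{X}$, $K\in\mathcal{K}$ and $I\subseteq K$.

The only difference between this bias bound and the bias bound in [15] is the presence of $\partial\bar{K}$ in place of
$\partial K$. For a given partition $\mathcal{K}$, any enlargement of the blocks in $\mathcal{K}$ yielding $\bar{\mathcal{K}}$ results in a tighter bias bound,
as expected, since $d(I,\partial\bar{K})\ge d(I,\partial K)$ for $I\subseteq K$.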

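Theorem 5 (Bounding the Variance). Assume there exist $0<\kappa,\varepsilon<1$ such that

$$\varepsilon\le p^v(x,z^v)\le\varepsilon^{-1}, \qquad \kappa\le g^v(x^v,y^v)\le\kappa^{-1} \quad \text{for all } v\in V,\ x,z\in\mathbb{X},\ y\in\mathbb{Y}$$

and such that

$$\varepsilon>\varepsilon_0 = \left(1-\frac{1}{6\Delta_{\bar{\mathcal{K}}}\Delta^2}\right)^{1/2\Delta}.$$

Let $\beta = -\log 6\Delta_{\bar{\mathcal{K}}}\Delta^2(1-\varepsilon^{2\Delta})>0$ where $\Delta_{\bar{\mathcal{K}}} := \max_{\bar{K}\in\bar{\mathcal{K}}}|\{\bar{K}'\in\bar{\mathcal{K}} : d(\bar{K},\bar{K}')\le r\}|$. Then for every $n\ge 0$ we have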


£  >  £o  =  1 


II  7t »  —  7t  m  || 


<  |/| 


64Aj^  £  x 


1  -  e~P 


Vn 


for every $x\in\mathbb{X}$, $K\in\mathcal{K}$ and $I\subseteq K$.

Again, the only significant difference between this variance bound and the variance bound in [15] is the
presence of $|\bar{\mathcal{K}}|_\infty$ in place of $|\mathcal{K}|_\infty$. The variance bound decays as $1/\sqrt{N}$ in the number of samples and grows
exponentially in the size of the enlarged blocks.


4.1 How to Use the Enlarged Blocked Filter

We now explain, roughly, how one may implement the enlarged blocked filter to reduce the bias as compared
with the algorithm proposed in [15] while maintaining a comparable variance and computational complexity.

Suppose firstly that one has a random field over $|V|$ sites and the computational power available (defining
a bound on $N$) ensures that blocks of size $|V|/k$ can be readily handled for some $k>0$. Then the complexity of
the blocked particle filter proposed in [15] can, in a sense, be regarded as being of order $O(kN)$. Really, one can
imagine $k$ particle filters running in parallel over each block and each with complexity on the order of $O(N)$.

To exploit the enlarged blocked particle filter, one should start with a larger number $c>k$ of smaller blocks
which when enlarged are mostly of the size $|V|/k$. Then, the complexity of the enlarged blocked particle filter
proposed herein is on the order $O(cN)$. One immediately sees that the variance of the enlarged blocked particle
filter is mostly on the same order as that of the algorithm proposed in [15] and the computational complexity
has only increased linearly. However, in almost all cases (and certainly with well-designed partitions) one will
achieve a reduction in the bias at any given site in the random field.
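
As a rough numerical illustration of this recipe, assume (purely for the example) that the sites form a one-dimensional lattice, so that an interior block of size $s$ grows to size $s+2b$ when enlarged by $b$. The helper below, whose name and arguments are hypothetical, compares the number $k$ of blocks of the target size with the number $c$ of smaller blocks that reach the target size once enlarged.

```python
import math

def block_counts(num_sites, target_size, b):
    """Return (k, c) on a 1-D lattice.

    k: number of standard blocks of size target_size.
    c: number of smaller blocks whose enlargement by b on each side
       has (at most) size target_size.
    """
    base_size = target_size - 2 * b
    if base_size <= 0:
        raise ValueError("b is too large for the target block size")
    k = math.ceil(num_sites / target_size)
    c = math.ceil(num_sites / base_size)
    return k, c

k, c = block_counts(num_sites=100, target_size=25, b=5)
print(k, c)  # 4 7
```

Here the per-step cost rises from the order of $4N$ to the order of $7N$, while every weight computation still involves at most 25 sites, which is the quantity that drives the variance bound.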

4.2  Spatial  Homogeneity 

We consider a special but interesting case in which a spatially homogeneous total error bound is obtained, the
bias bound is better (tighter) than in [15], and the computational requirements are largely unchanged when
compared with the algorithm in [15].

Corollary 1. Assume the same hypothesis as Theorem 4. Consider the partition $\mathcal{K} = \{\{v\}\}_{v\in V}$ and suppose $b>r$.
Then for every $n\ge 0$, $x\in\mathbb{X}$, and $v\in V$ we have

lll7r«-7t«IL  ^  — 7j  (1- e2A)e“^(fc_r) 

1  -  e~P 

This bound is spatially homogeneous and with $b>r$ it is strictly less than the bias bound introduced in [15].
Note that while the bias bound here is spatially homogeneous, the actual bias may still be inhomogeneous
since the bound may be conservative. On the other hand, it is possible to apply the adaptive
scheme proposed in [2] with the enlarged blocked filter and potentially achieve true spatial homogeneity.

4.3  Discussion  on  the  Enlarged  Blocked  Filter 

The  idea  of  the  enlarged  blocked  particle  filter  is  essentially  based  on  the  principle  that  larger  blocks  lead  to  a 
reduction  in  the  bias  introduced  due  to  blocking. 

So,  why  not  just  start  with  larger  blocks? 

•  Well,  irrespective  of  the  size  of  the  blocks,  if  one  applies  the  standard  blocked  particle  filter  of  [15]  then 
there  will  always  exist  sites  on  the  border  of  a  block. 

•  If  we  extend  (or  enlarge)  the  blocks  as  proposed  herein,  we  (typically)  reduce  the  bias  at  each  site  (and 
particularly  those  sites  that  were  on  the  border  of  a  block  in  the  original  partition). 

•  If  we  increase  the  number  of  samples  N  with  a  fixed  number  of  larger  blocks  (in  the  original  partition) 
then  while  we  can  reduce  the  variance  we  have  no  effect  on  the  bias  for  those  sites  on  the  border. 

• If we start with small blocks in the original partition and then simultaneously enlarge the blocks along
with the number of samples $N$ then it may be possible to maintain a given variance (or even reduce the
variance) as compared to a partition with larger original block sizes but with a guaranteed smaller bias
at each site.

The high-level point is that it is computationally more desirable to run a few extra parallel implementations
of the particle filter (corresponding to more (enlarged) blocks) and obtain a tighter bias bound than it is to run
a few fewer parallel implementations of the particle filter for the same variance bound but a larger bias bound.
This is only possible through enlargement of the blocks as described herein.

Finally, we comment on the matter of consistency (as defined in, say, [10]) and observational double
counting. Consider the partition $\mathcal{K} = \{\{v\}\}_{v\in V}$ and suppose $\bar{K} = V$ for each $K = \{v\}\in\mathcal{K}$. Practically, following the
standard prediction step, the enlarged blocked filter is of the form $\mathsf{B}^{-1}\mathsf{C}_n^E\mathsf{B}^E\rho$, which is mathematically
equivalent to $\mathsf{B}\mathsf{C}_n\rho$. The point of this illustration is to highlight that even in this case, involving the most extreme
enlargement possible, we are not double counting information or effectively applying measurements twice,
and the enlarged blocked particle filter is consistent as per [10].


4.4  Proof  Strategy 

In this section we provide a summary of the proof strategy. Clearly the main result in Theorem 3 is immediately
implied by Theorems 4 and 5. Much of the technical analysis required in the proof of Theorem 3 is similar to
that originally detailed in [15].

In the case of the bias $|||\pi_n^\mu-\tilde\pi_n^\mu|||_I$, one first derives a local stability property for the filter which implies
that the marginal over a local set $I\subseteq V$ of the initial state $\mu$ is forgotten exponentially fast. Such a property also
implies that any approximation errors in, say, the initial state are also forgotten. It then follows that if one can
bound the one-step approximation error $|||\mathsf{F}_n\tilde\pi_{n-1}^\mu-\tilde{\mathsf{F}}_n\tilde\pi_{n-1}^\mu|||_I$ at any time, then in conjunction with the local
stability property one will obtain a time-uniform bound on the bias over a local region of the field.

In the case of the variance $|||\tilde\pi_n^\mu-\hat\pi_n^\mu|||_I$, a similar idea is used except one first establishes stability for the ideal
enlarged blocked filter $\tilde\pi_n^\mu$. Then, one must bound the one-step approximation error $|||\tilde{\mathsf{F}}_n\hat\pi_{n-1}^\mu-\hat{\mathsf{F}}_n^E\hat\pi_{n-1}^\mu|||_I$ at
any time. Putting the stability property and the bound on the one-step approximation together, one achieves
the desired time-uniform bound on the variance of a block in the enlarged blocked filter.

We have obviously glossed over many of the intricacies involved in the proof in this summary. For example,
in the case of the bias, the property introduced in [15] and referred to as the decay of correlations must be
established to hold uniformly in time for the ideal block filter $\tilde\pi_n^\mu$. This property captures a notion of spatial
stability where the state at some site in the random field is forgotten as one moves away from that site.
Rebeschini and Van Handel provide a novel measure of this decay that allows them to establish local stability of the filter $\pi_n^\mu$ and
to establish a bound on the one-step approximation error $|||\mathsf{F}_n\tilde\pi_{n-1}^\mu-\tilde{\mathsf{F}}_n\tilde\pi_{n-1}^\mu|||_I$. Conceptually, a property like
the decay of correlations is necessary to establish such results.

Summarising,  the  steps  needed  to  prove  the  bias  bound  are 

1 .  Proving  a  (local)  stability  result  for  the  ideal  Bayesian  filter; 

2. Proving that a desired decay of correlation property holds uniformly in time for the measure $\tilde\pi_n^\mu$;

3.  Controlling  the  one  time-step  error  introduced  by  the  new  enlarged  blocked  filter; 

4. Putting all these results together and finalising Theorem 4.

The  variance  analysis  follows  much  the  same  path  with  the  prime  difficulty  being  establishment  of  local 
stability  for  the  ideal  enlarged  blocked  filter.  Summarising  the  steps  involved  in  proving  the  variance  bound, 

1 .  Proving  a  local  stability  result  for  the  ideal  enlarged  blocked  filter; 

2.  Controlling  the  one  time-step  error  due  to  the  sampling  in  the  enlarged  blocked  particle  filter; 

3. Putting these results together and finalising Theorem 5.

The proof details are omitted in this version of the work due to their similarity with those details presented in
[15], but are available upon request.


5  Concluding  Remarks 

We have presented a modified version of the blocked particle filter originally proposed in [15]. The main
feature of our algorithm is that we add a new parameter that can be tuned to decrease the bias as compared to
[15]. The high-level argument for this approach is that it is computationally more desirable to run a few extra
parallel implementations of the particle filter (corresponding to more (enlarged) blocks) and obtain a tighter
bias bound than it is to run a few fewer parallel implementations of the particle filter for the same variance
bound but a larger bias bound. This gain in bias reduction, with the same variance, and only a linear increase
in the computational complexity, is only possible through enlargement of the blocks as described herein.

Finally, we also point out that the same adaptive approach to changing partitions proposed in [2] could be
applied in the case of the enlarged blocked filter. This is an additional method for spatial smoothing and
may be of interest in those cases in which the underlying model is time-varying.